Big Data and Statistics: a Statistician's Perspective.

Author

  • David Rossell

Abstract

Big Data brings unprecedented power to address scientific, economic and societal issues, but also amplifies the possibility of certain pitfalls. These include using purely data-driven approaches that disregard understanding the phenomenon under study, aiming at a dynamically moving target, ignoring critical data collection issues, summarizing or preprocessing the data inadequately and mistaking noise for signal. We review some success stories and illustrate how statistical principles can help obtain more reliable information from data. We also touch upon current challenges that require active methodological research, such as strategies for efficient computation, integration of heterogeneous data, extending the underlying theory to increasingly complex questions and, perhaps most importantly, training a new generation of scientists to develop and deploy these strategies.

Similar resources

Survey on Perception of People Regarding Utilization of Computer Science & Information Technology in Manipulation of Big Data, Disease Detection & Drug Discovery

This research explores the manipulation of biomedical big data and disease detection using automated computing mechanisms. Because an efficient and cost-effective way to detect diseases and discover drugs is important for society, a computer-aided automated system is a must. This paper aims to understand how people perceive the importance of computer-aided automated systems. The analysis result from collected da...

Criteria for authorship for statisticians in medical papers.

We organize a statistician's potential scientific and intellectual contributions to a medical study into three types of activities relating to design, implementation and analysis. For each type, we describe high-level, mid-level and low-level contributions. Using this framework, we develop a point system to assess whether authorship is justified. Although we recommend discussion and resolution ...

A statistician’s perspective on digital epidemiology

We address the question "does digital epidemiology represent an epistemic shift in infectious disease epidemiology" from a statistician's viewpoint. Our main argument is that infectious disease epidemiology has not changed fundamentally as it always has been data-driven. However, as the data aspect has become more prominent, we discuss the statistical toolbox of the modern epidemiologist and ar...

The hope and the hazards of using compliance data in randomized controlled trials.

This paper aims to elucidate both the advantages and limitations of using compliance data in the reporting of treatment differences in clinical trials, illustrating the issues with some recent examples. While analysis by intention-to-treat should remain the principal reporting approach for most major clinical trials, arguments are put forward as to why supplementary analyses taking account of c...

Bayesian Modeling Based on Data from the Internet of Things

The Internet of Things has been proposed as the next revolution in information and communication technology because of its strong potential to make businesses and industries more productive and efficient. This productivity comes from the emergence of innovation and the introduction of new capabilities for businesses. Different industries have shown varying reactions to IoT, but wha...

Journal title:
  • Metode science studies journal

Volume 5

Publication date 2015